** Data reading and variable selection from raw data
** Family Survey of the Dutch Population (Family-enquete Nederlandse Bevolking) 2003


** 01. Reading data **
cap log close
clear all
set more off
cd /*insert your working directory here*/
use /*read your data here*/ 


** 02. Constructing year and country variables **

ge year=2003
lab var year "survey year"

ge country=528
lab var country "ISO country code"
//Netherlands: 528 (see "ISO Country Codes.pdf")


** 03. ID variables **

ge pid=respnr
lab var pid "respondent id"

ge fid=hhnr
lab var fid "family id"
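
// Optional check (a sketch, not part of the original workflow): report
// duplicate respondent ids. This assumes respnr is meant to identify
// respondents uniquely; the source does not state this.
duplicates report pid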


** 04. Basic Demographics (Sex and Age/birth year) **

rename sex asex
// recode creates a new variable with the generate() option (abbreviated gen())
recode asex (1=1 "male") (2=2 "female"), gen(sex)
lab var sex "sex"

rename leeft age
lab var age "age"

rename byear birthyr
lab var birthyr "year of birth"


** 05. Siblings **

ge nbro=c1
lab var nbro "number of brothers"
ge nsis=c2
lab var nsis "number of sisters"

ge nsibs=nbro+nsis
lab var nsibs "number of siblings"

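// Birth order = 1 + number of siblings born before the respondent.
// A missing sibling birth year compares as larger than any number, so it is not counted.
// Respondents with a missing birth year get a missing birth order.
ge birthorder=1 if !missing(birthyr)
forvalues i=1/20 {
	replace birthorder=birthorder+1 if c5_`i'jr<birthyr & !missing(birthyr)
}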

lab var birthorder "birth order of the respondent"
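
// Optional sanity check (a sketch): birth order should not exceed the number
// of siblings plus one. This assumes c1 and c2 count the same siblings whose
// birth years are recorded in c5_1jr-c5_20jr; a nonzero count flags cases to inspect.
count if birthorder>nsibs+1 & !missing(birthorder) & !missing(nsibs)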


** 06. Own education **
//All translated by google
tab educlev
ge educ=educlev
lab var educ "level of education of the respondent"

lab def educ 0 "below elementary school" 1 "elementary school" 2 "lbo, huishoudschool, vbo" 3 "mavo, ulo, mulo" 4 "havo, mms" ///
5 "grammar school" 6 "kort mbo (kmbo)" 7 "volledig mbo" 8 "college/bachelor's degree" 9 "university" 10 "postgraduate"

lab val educ educ


** 07. Parents' education: Father and/or Mother **
//All translated by google
ge faeduc=b14
lab var faeduc "father's highest education obtained"
ge moeduc=b16
lab var moeduc "mother's highest education obtained"
lab val faeduc moeduc educ


** 08. Own occupation **

ge lastocc_ISEI=sei
lab var lastocc_ISEI "respondent's occupation for last occupation in ISEI codes"

ge lastocc_CBS=cbs
lab var lastocc_CBS "respondent's industry for last occupation in CBS codes"

ge lastocc_EGP=egp
lab var lastocc_EGP "class for last occupation in EGP codes"

ge occ_ISEI=cursei
lab var occ_ISEI "respondent's occupation for current occupation in ISEI codes"

ge occ_CBS=curcbs
lab var occ_CBS "respondent's industry for current occupation in CBS codes"

ge occ_EGP=curegp
lab var occ_EGP "class for current occupation in EGP codes"


** 09. Parents' occupation **
//All translated by google
ge facbs15=b18cbs 
lab var facbs15 "father's occupation (CBS code) when respondent was 15"
ge fasei15=b18sei
lab var fasei15 "father's occupation (SEI code) when respondent was 15"
ge faegp15=b18egp
lab var faegp15 "father's occupation (EGP code) when respondent was 15"

ge macbs15=b30cbs 
lab var macbs15 "mother's occupation (CBS code) when respondent was 15"
ge masei15=b30sei
lab var masei15 "mother's occupation (SEI code) when respondent was 15"
ge maegp15=b30egp
lab var maegp15 "mother's occupation (EGP code) when respondent was 15"


** 10. Tabulate the Identified Variables **

log using /*insert your log file path here*/, replace text

** Data reading and variable selection from raw data
** Family Survey of the Dutch Population (Family-enquete Nederlandse Bevolking) 2003

** Sex **
tab sex

** Age, Birth Year **
sum age birthyr, d

** Siblings **
sum nsibs nbro nsis birthorder, d

** R's Own Education **
tab1 educ 

** Parental Education **
tab1 faeduc moeduc 

** R's Own Occupation **
tab1 lastocc_ISEI lastocc_CBS lastocc_EGP occ_ISEI occ_CBS occ_EGP  

** Parental Occupation **
tab1 facbs15 faegp15 fasei15 macbs15 maegp15 masei15


log close

** 11. Keep the identified variables only

keep year country pid fid sex age birthyr ///
	 nsibs nbro nsis birthorder ///
	 educ faeduc moeduc ///
	 lastocc_ISEI lastocc_CBS lastocc_EGP occ_ISEI occ_CBS occ_EGP /// 
	 facbs15 faegp15 fasei15 macbs15 maegp15 masei15


** 12. Save the Data File **

saveold /*insert your output file path here*/, replace



** 13. Homogenising education **
** Own Education **
rename educ educ_cat

ge educ_yrs=0 if educ_cat==0
replace educ_yrs=6 if educ_cat==1
replace educ_yrs=9 if educ_cat==2
replace educ_yrs=10 if educ_cat==3
replace educ_yrs=11 if educ_cat==4
replace educ_yrs=12 if educ_cat==5
replace educ_yrs=10 if educ_cat==6
replace educ_yrs=10.5 if educ_cat==7
replace educ_yrs=15 if educ_cat==8
replace educ_yrs=17 if educ_cat==9
replace educ_yrs=21 if educ_cat==10
lab var educ_yrs "respondent's highest education in years"

ge educ_ISCED=100 if educ_cat==1
replace educ_ISCED=254 if educ_cat==2
replace educ_ISCED=244 if educ_cat==3
replace educ_ISCED=300 if educ_cat==4
replace educ_ISCED=300 if educ_cat==5
replace educ_ISCED=353 if educ_cat==6
replace educ_ISCED=354 if educ_cat==7
replace educ_ISCED=500 if educ_cat==8
replace educ_ISCED=600 if educ_cat==9
replace educ_ISCED=747 if educ_cat==10
lab var educ_ISCED "respondent's highest education in ISCED code"
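
// Optional check (a sketch): cross-tabulate the source categories against the
// derived variables to confirm that each category maps to a single value.
// Category 0 has no ISCED code assigned above, so it appears as missing here.
tab educ_cat educ_yrs, missing
tab educ_cat educ_ISCED, missing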

** Parents Education **

ge faeduc_flag=1 

rename faeduc faeduc_cat
rename moeduc maeduc_cat

ge faeduc_yrs=0 if faeduc_cat==0
replace faeduc_yrs=6 if faeduc_cat==1
replace faeduc_yrs=9 if faeduc_cat==2
replace faeduc_yrs=10 if faeduc_cat==3
replace faeduc_yrs=11 if faeduc_cat==4
replace faeduc_yrs=12 if faeduc_cat==5
replace faeduc_yrs=10 if faeduc_cat==6
replace faeduc_yrs=10.5 if faeduc_cat==7
replace faeduc_yrs=15 if faeduc_cat==8
replace faeduc_yrs=17 if faeduc_cat==9
replace faeduc_yrs=21 if faeduc_cat==10
lab var faeduc_yrs "father's education in years"

ge maeduc_yrs=0 if maeduc_cat==0
replace maeduc_yrs=6 if maeduc_cat==1
replace maeduc_yrs=9 if maeduc_cat==2
replace maeduc_yrs=10 if maeduc_cat==3
replace maeduc_yrs=11 if maeduc_cat==4
replace maeduc_yrs=12 if maeduc_cat==5
replace maeduc_yrs=10 if maeduc_cat==6
replace maeduc_yrs=10.5 if maeduc_cat==7
replace maeduc_yrs=15 if maeduc_cat==8
replace maeduc_yrs=17 if maeduc_cat==9
replace maeduc_yrs=21 if maeduc_cat==10
lab var maeduc_yrs "mother's education in years"

ge faeduc_ISCED=100 if faeduc_cat==1
replace faeduc_ISCED=254 if faeduc_cat==2
replace faeduc_ISCED=244 if faeduc_cat==3
replace faeduc_ISCED=300 if faeduc_cat==4
replace faeduc_ISCED=300 if faeduc_cat==5
replace faeduc_ISCED=353 if faeduc_cat==6
replace faeduc_ISCED=354 if faeduc_cat==7
replace faeduc_ISCED=500 if faeduc_cat==8
replace faeduc_ISCED=600 if faeduc_cat==9
replace faeduc_ISCED=747 if faeduc_cat==10
lab var faeduc_ISCED "father's highest education in ISCED code"

ge maeduc_ISCED=100 if maeduc_cat==1
replace maeduc_ISCED=254 if maeduc_cat==2
replace maeduc_ISCED=244 if maeduc_cat==3
replace maeduc_ISCED=300 if maeduc_cat==4
replace maeduc_ISCED=300 if maeduc_cat==5
replace maeduc_ISCED=353 if maeduc_cat==6
replace maeduc_ISCED=354 if maeduc_cat==7
replace maeduc_ISCED=500 if maeduc_cat==8
replace maeduc_ISCED=600 if maeduc_cat==9
replace maeduc_ISCED=747 if maeduc_cat==10
lab var maeduc_ISCED "mother's highest education in ISCED code"
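
// Optional check (a sketch): the same cross-tabulation for the parental
// variables. As for the respondent, category 0 has no ISCED code assigned
// above and shows up as missing.
tab faeduc_cat faeduc_ISCED, missing
tab maeduc_cat maeduc_ISCED, missing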

** 14. Homogenising siblings **
//cutoff
ge nsibs_flag=99
lab var nsibs_flag "cutoff of total number of siblings"
ge nsis_flag=99
lab var nsis_flag "cutoff of number of sisters"
ge nbro_flag=99
lab var nbro_flag "cutoff of number of brothers"

lab def nsib_flag 99 "no cutoff"
lab val nsis_flag nbro_flag nsibs_flag nsib_flag


** 15. Tab Education and Sibling Variables **
tab1 sex age birthyr
tab1 educ_cat educ_yrs faeduc_cat faeduc_yrs maeduc_cat maeduc_yrs faeduc_flag 
tab1 nsis nbro nsibs nsis_flag nbro_flag nsibs_flag


** 16. Save the Data File **

saveold /*insert your output file path here*/, replace
